Using combinatorial optimization in model-based trimmed clustering with cardinality constraints

Authors

  • María Teresa Gallegos
  • Gunter Ritter
Abstract

Statistical clustering criteria with free scale parameters and unknown cluster sizes tend to create small, spurious clusters. To mitigate this tendency, a statistical model for cardinality-constrained clustering of data with gross outliers is established, its maximum likelihood and maximum a posteriori clustering criteria are derived, and their consistency and robustness are analyzed. The criteria lead to constrained optimization problems that can be solved by iterative, alternating trimming algorithms of k-means type. Each step of these algorithms requires solving a λ-assignment problem known from combinatorial optimization. The method also makes it possible to estimate the numbers of clusters and outliers. It is illustrated with a synthetic and a real data set.
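The alternating scheme described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes squared Euclidean distance in place of the likelihood-based criteria, a single minimum cluster size `b` shared by all clusters, and a fixed number `r` of retained points. The function names (`hungarian`, `constrained_trimmed_kmeans`) are hypothetical, and the reduction of the constrained assignment step to a square assignment problem is one standard construction, not necessarily the λ-assignment formulation used in the paper.

```python
def hungarian(cost):
    """Minimum-cost perfect matching for a square cost matrix (O(n^3))."""
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (m + 1)   # dual potentials
    p, way = [0] * (m + 1), [0] * (m + 1)     # p[j]: row matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [INF] * (m + 1), [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                              # augment along the found path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    row_to_col = [0] * n
    for j in range(1, m + 1):
        row_to_col[p[j] - 1] = j - 1
    return row_to_col


def sqdist(x, c):
    return sum((a - b) ** 2 for a, b in zip(x, c))


def constrained_trimmed_kmeans(points, centers, r, b, iters=20):
    """Keep r of the n points, give every cluster at least b of them.

    Returns (labels, centers); label -1 marks a trimmed (discarded) point.
    """
    n, g = len(points), len(centers)
    assert g * b <= r <= n
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(iters):
        d = [[sqdist(x, c) for c in centers] for x in points]
        # Columns: b mandatory slots per cluster, r - g*b free slots priced at
        # the nearest center, and n - r zero-cost discard slots (n columns).
        cost = []
        for i in range(n):
            row = [d[i][j] for j in range(g) for _ in range(b)]
            row += [min(d[i])] * (r - g * b) + [0.0] * (n - r)
            cost.append(row)
        new_labels = []
        for i, col in enumerate(hungarian(cost)):
            if col < g * b:
                new_labels.append(col // b)             # mandatory slot
            elif col < r:
                new_labels.append(d[i].index(min(d[i])))  # free slot
            else:
                new_labels.append(-1)                   # trimmed
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(g):                              # update step: cluster means
            members = [points[i] for i in range(n) if labels[i] == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (100, 100)]
    print(constrained_trimmed_kmeans(pts, [(0, 0), (10, 10)], r=6, b=2))
```

The assignment step is the only place where the cardinality constraint and the trimming interact: because every column of the square cost matrix must be matched, each cluster receives its `b` mandatory points and exactly `n - r` points are discarded. Replacing the squared distances with a model-based cost would leave this structure unchanged.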

Similar resources

Optimization of profit and customer satisfaction in combinatorial production and purchase model by genetic algorithm

Optimization of inventory costs is the most important goal in industries. But in many models, the constraints are considered simple and relaxed. Some actual constraints are to consider the combinatorial production and purchase models in multi-products environment. The purpose of this article is to improve the efficiency of inventory management and find the economic order quantity and economic p...

Constrained K-means with General Pairwise and Cardinality Constraints

In this work, we study constrained clustering, where some constraints are utilized to guide the clustering process. In existing work on this topic, two main categories of constraints have been explored, namely pairwise and cardinality constraints. Pairwise constraints enforce that the cluster labels of two instances be the same (must-link constraints) or different (cannot-link constraints). Car...

A Robust Knapsack Based Constrained Portfolio Optimization

Many portfolio optimization problems deal with allocation of assets which carry a relatively high market price. Therefore, it is necessary to determine the integer value of assets when we deal with portfolio optimization. In addition, one of the main concerns with most portfolio optimization is associated with the type of constraints considered in different models. In many cases, the resulted p...

An Evaluation of the GRASP Algorithm in Optimal Portfolio Selection (with a Cardinality Constraint)

In the portfolio optimization problem, the Markowitz model is still regarded as the dominant approach. However, because the model does not account for real-world constraints, such as a limit on the number of assets in the portfolio or minimum and maximum amounts for each asset, it is sometimes unable to solve real-world problems. For this reason, metaheuristic algorithms can be useful, given their flexibility. In ...

A Multi-Objective Approach to Fuzzy Clustering using ITLBO Algorithm

Data clustering is one of the most important areas of research in data mining and knowledge discovery. Recent research in this area has shown that the best clustering results can be achieved using multi-objective methods. In other words, assuming more than one criterion as objective functions for clustering data can measurably increase the quality of clustering. In this study, a model with two ...

Journal title:
  • Computational Statistics & Data Analysis

Volume 54  Issue 

Pages  -

Publication date 2010